22 research outputs found

Accelerating MCMC via Parallel Predictive Prefetching

Author: Adams Ryan P.
Angelino Elaine
Kohler Eddie
Seltzer Margo
Waterland Amos
Publication date: 27/03/2014

We present a general framework for accelerating a large class of widely used Markov chain Monte Carlo (MCMC) algorithms. Our approach exploits fast, iterative approximations to the target density to speculatively evaluate many potential future steps of the chain in parallel. The approach can accelerate computation of the target distribution of a Bayesian inference problem, without compromising exactness, by exploiting subsets of data. It takes advantage of whatever parallel resources are available, but produces results exactly equivalent to standard serial execution. In the initial burn-in phase of chain evaluation, it achieves speedup over serial evaluation that is close to linear in the number of available cores.

arXiv.org e-Print Archive

CiteSeerX
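
The prefetching idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it enumerates the full accept/reject tree for a few Metropolis-Hastings steps instead of using predictive approximations to choose likely branches, and `log_target`, `serial_mh`, and `prefetch_mh` are hypothetical names introduced here. The point it shows is that, when the random draws for each step are fixed in advance, evaluating candidate densities ahead of time yields exactly the serial chain. A thread pool stands in for real parallel hardware; with a pure-Python density it shows the structure, not a speedup.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def log_target(x):
    # Hypothetical target: standard normal log-density, up to a constant.
    return -0.5 * x * x


def serial_mh(x0, noise, log_u):
    # Plain random-walk Metropolis-Hastings with pre-drawn randomness.
    x, chain = x0, []
    for eps, lu in zip(noise, log_u):
        prop = x + eps
        if lu < log_target(prop) - log_target(x):
            x = prop
        chain.append(x)
    return chain


def prefetch_mh(x0, noise, log_u, workers=4):
    # Enumerate every state the chain could occupy after each step:
    # each state either stays put (reject) or moves by eps (accept).
    levels = [[x0]]
    for eps in noise:
        levels.append([s for x in levels[-1] for s in (x, x + eps)])
    candidates = sorted({s for level in levels for s in level})

    # Speculatively evaluate the density at all candidates in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        cache = dict(zip(candidates, pool.map(log_target, candidates)))

    # Replay the chain serially, using only cached density values.
    x, chain = x0, []
    for eps, lu in zip(noise, log_u):
        prop = x + eps
        if lu < cache[prop] - cache[x]:
            x = prop
        chain.append(x)
    return chain


if __name__ == "__main__":
    rng = random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
    log_u = [math.log(1.0 - rng.random()) for _ in range(4)]
    print(prefetch_mh(0.0, noise, log_u) == serial_mh(0.0, noise, log_u))
```

The full tree grows as 2^d with the lookahead depth d, which is why a practical scheme would evaluate only the branches it predicts are likely.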

Parallelization by Simulated Tunneling

Author: Appavoo Jonathan
Seltzer Margo I.
Waterland Amos
Publication venue: USENIX Association
Publication date: 26/11/2012

As highly parallel heterogeneous computers become commonplace, automatic parallelization of software is an increasingly critical unsolved problem. Continued progress on this problem will require large quantities of information about the runtime structure of sequential programs to be stored and reasoned about. Manually formalizing all this information through traditional approaches, which rely on semantic analysis at the language or instruction level, has historically proved challenging. We take a lower-level approach, eschewing semantic analysis and instead modeling von Neumann computation as a dynamical system, i.e., a state space and an evolution rule, which gives a natural way to use probabilistic inference to automatically learn powerful representations of this information. This model enables a promising new approach to automatic parallelization, in which probability distributions empirically learned over the state space are used to guide speculative solvers. We describe a prototype virtual machine that uses this model of computation to automatically achieve linear speedups for an important class of deterministic, sequential Intel binary programs through statistical machine learning and a speculative, generalized form of memoization.

Engineering and Applied Science

Harvard University - DASH

Programmable smart machines

Author: Appavoo Jonathan
Schatzberg Dan
Waterland Amos
Publication venue: Computer Science Department, Boston University
Publication date: 15/04/2012

In this paper we conjecture that a system can be constructed that exploits the general ability to learn through the counting, correlating, and memorizing of occurrences of events to fast-forward a programmable computer. In particular, we propose a signal-based interpretation of a computer's execution that can be used to implement a form of system state memoization using a predictive associative memory. Such an approach may someday lead to a system that can utilize both traditional logic and neuromorphic or other biologically inspired mechanisms to be both programmable and smart.

Department of Energy Office of Science (DE-SC0005365), National Science Foundation (1012798

Boston University Institutional Repository (OpenBU)

Towards General-Purpose Neural Network Computing

Author: Appavoo Jonathan
Eldridge Schuyler
Joshi Ajay
Seltzer Margo I.
Waterland Amos
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 15/03/2017

Machine learning is becoming pervasive; decades of research in neural network computation are now being leveraged to learn patterns in data and perform computations that are difficult to express using standard programming approaches. Recent work has demonstrated that custom hardware accelerators for neural network processing can outperform software implementations in both performance and power consumption. However, there is neither an agreed-upon interface to neural network accelerators nor a consensus on neural network hardware implementations. We present a generic set of software/hardware extensions, X-FILES, that allow for the general-purpose integration of feedforward and feedback neural network computation in applications. The interface is independent of the network type, configuration, and implementation. Using these proposed extensions, we demonstrate and evaluate an example dynamically allocated, multi-context neural network accelerator architecture, DANA. We show that the combination of X-FILES and our hardware prototype, DANA, enables generic support and increased throughput for neural-network-based computation in multi-threaded scenarios.

Engineering and Applied Science

Harvard University - DASH

Providing a Cloud Network Infrastructure on a Supercomputer

Author: Appavoo Jonathan
Da Silva Dilma
Rosenburg Bryan
Steinberg Udo
Stoess Jan
Uhlig Volkmar
Van Hensbergen Eric
Waterland Amos
Wisniewski Robert
Publication venue: Association for Computing Machinery
Publication date: 01/01/2010

Crossref

KITopen

ASC: Automatically Scalable Computation

Author: Adams Ryan Prescott
Angelino Elaine
Appavoo Jonathan
Seltzer Margo I.
Waterland Amos
Publication venue: Association for Computing Machinery (ACM)
Publication date: 30/10/2017

We present an architecture designed to transparently and automatically scale the performance of sequential programs as a function of the hardware resources available. The architecture is predicated on a model of computation that views program execution as a walk through the enormous state space composed of the memory and registers of a single-threaded processor. Each instruction execution in this model moves the system from its current point in state space to a deterministic subsequent point. We can parallelize such execution by predictively partitioning the complete path and speculatively executing each partition in parallel. Accurately partitioning the path is a challenging prediction problem. We have implemented our system using a functional simulator that emulates the x86 instruction set, including a collection of state predictors and a mechanism for speculatively executing threads that explore potential states along the execution path. While the overhead of our simulation makes it impractical to measure speedup relative to native x86 execution, experiments on three benchmarks show scalability of up to a factor of 256 on a 1024-core machine when executing unmodified sequential programs.

Engineering and Applied Science

Harvard University - DASH
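
The state-space view in the abstract above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the ASC system: `step` is a hypothetical deterministic transition standing in for one instruction, the state is a single small integer, and `predicted_starts` stands in for the output of a state predictor. Workers run ahead from predicted states and record where they end up; the main execution jumps forward whenever its current state matches a recorded start. A wrong prediction only wastes work and never changes the result.

```python
from concurrent.futures import ThreadPoolExecutor

MOD = 97  # hypothetical tiny state space, chosen to keep the toy readable


def step(state):
    # Hypothetical deterministic transition standing in for one instruction.
    return (state * state + 1) % MOD


def run_sequential(state, n_steps):
    # Reference execution: apply the transition n_steps times.
    for _ in range(n_steps):
        state = step(state)
    return state


def speculate(start, k):
    # Run k steps from a predicted start state; returns one cache entry.
    return start, run_sequential(start, k)


def run_with_cache(state, n_steps, predicted_starts, k=8, workers=4):
    # Speculative phase: execute ahead from each predicted state in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        cache = dict(pool.map(lambda s: speculate(s, k), predicted_starts))

    # Main phase: jump k steps on a cache hit, otherwise take one real step.
    remaining, hits = n_steps, 0
    while remaining > 0:
        if remaining >= k and state in cache:
            state = cache[state]
            remaining -= k
            hits += 1
        else:
            state = step(state)
            remaining -= 1
    return state, hits


if __name__ == "__main__":
    final, hits = run_with_cache(3, 100, range(0, MOD, 2))
    print(final == run_sequential(3, 100), hits)
```

Because every cache entry is produced by the same deterministic `step`, a hit is always a correct shortcut; the quality of the predictions affects only how many steps are skipped.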

A Light-Weight Virtual Machine Monitor for Blue Gene/P

Author: Appavoo Jonathan
Kehne Jens
Steinberg Udo
Stoess Jan
Uhlig Volkmar
Waterland Amos
Publication venue: Association for Computing Machinery
Publication date: 01/01/2011

KITopen

Programmable Smart Machines: A Hybrid Neuromorphic approach to General Purpose Computation

Author: Appavoo Jonathan
Eldridge Schuyler
Homer Steve
Joshi Ajay
Seltzer Margo I.
Waterland Amos
Zhao Katherine
Publication date: 21/09/2015
Field of study

Engineering and Applied Science

Harvard University - DASH

Computational Caches

Author: Adams Ryan Prescott
Angelino Elaine Lee
Appavoo Jonathan
Cubuk Ekin Dogus
Kaxiras Efthimios
Seltzer Margo I.
Waterland Amos
Publication venue: Association for Computing Machinery (ACM)
Publication date: 20/09/2017

Caching is a well-known technique for speeding up computation. We cache data from file systems and databases; we cache dynamically generated code blocks; we cache page translations in TLBs. We propose to cache the act of computation, so that we can apply it later and in different contexts. We use a state-space model of computation to support such caching, involving two interrelated parts: speculatively memoized predicted/resultant state pairs that we use to accelerate sequential computation, and trained probabilistic models that we use to generate predicted states from which to speculatively execute. The key techniques that make this approach feasible are designing probabilistic models that automatically focus on regions of program execution state space in which prediction is tractable and identifying state space equivalence classes so that predictions need not be exact.

Engineering and Applied Science

Harvard University - DASH